In this homework we will focus on another method for explaining black-box models: LIME. We will use the same dataset as in the previous homework, the famous heart disease dataset, and the same models: XGBoost and logistic regression. In this notebook we will cover the following topics:
1) Calculation of the LIME explanation for a given observation using the lime and dalex packages. 2) Comparison of LIME explanations for different observations in the dataset. 3) Comparison of LIME and SHAP explanations for the same observation. 4) A detailed comparison of LIME explanations between the XGBoost and logistic regression models, to see if there are any systematic differences between them.
LIME stands for Local Interpretable Model-agnostic Explanations.
1) Model agnosticism means that LIME can explain any supervised learning model by treating it as a black box: it only needs access to the model's predictions, not its internals. This means that LIME can handle almost any model. 2) Local explanations mean that LIME gives explanations that are faithful only within the surroundings or vicinity of the observation/sample being explained.
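These two properties can be sketched end to end with a toy example: perturb the observation, weight the perturbed samples by proximity, and fit a weighted linear surrogate to the black-box predictions. The model below is a synthetic stand-in for a fitted classifier, not the notebook's XGBoost model, and the kernel width and sample count are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for a fitted black-box classifier (NOT the real
# XGBoost model): a smooth nonlinear scorer over two features.
def black_box(X):
    return 1.0 / (1.0 + np.exp(-(2.0 * X[:, 0] - 1.5 * X[:, 1] + X[:, 0] * X[:, 1])))

x0 = np.array([0.5, -0.2])  # the observation to explain

# 1) Perturb the observation in its neighbourhood.
Z = x0 + rng.normal(scale=0.3, size=(5000, 2))
y = black_box(Z)

# 2) Weight the perturbed samples by proximity to x0 (Gaussian kernel).
w = np.exp(-np.sum((Z - x0) ** 2, axis=1) / 0.25)

# 3) Fit a weighted linear surrogate via weighted least squares.
A = np.hstack([np.ones((len(Z), 1)), Z])   # intercept column + features
sw = np.sqrt(w)
coef, *_ = np.linalg.lstsq(sw[:, None] * A, sw * y, rcond=None)
intercept, effects = coef[0], coef[1:]

# The surrogate's prediction at x0 approximates the black-box output.
local_pred = intercept + effects @ x0
print(local_pred, black_box(x0[None])[0])
```

The `effects` vector plays the role of the per-feature effect values that lime and dalex report; the real packages additionally discretize the features into interpretable binary rules.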
I will use the XGBoost model to predict heart disease for two selected patients.
The first patient, number 56, is a male, aged 48 years, with chest pain type 0 (typical angina), resting blood pressure of 122 mm Hg, cholesterol of 222 mg/dl, fasting blood sugar of 0 (false), resting electrocardiographic results of 0 (normal), maximum heart rate achieved of 186, exercise induced angina of 0 (no), ST depression induced by exercise relative to rest of 0.0, slope of the peak exercise ST segment of 2, number of major vessels of 0, and thalassemia of 2 (normal).
The model predicts a ~0.99 probability that the patient does NOT have heart disease. The prediction is correct: the patient does not have heart disease.
According to the LIME explanation, the most important features that led to a prediction of no heart disease for patient 56 are:
1) caa_0 == 1 - the number of major vessels being 0 - effect ~ 0.39 - it had a positive effect on the prediction, meaning that the patient has a higher probability of not having heart disease because he has 0 major vessels.
2) cp_0 == 1 - chest pain type 0 (typical angina) - effect ~ -0.23 - it had a negative effect on the prediction, meaning that the patient has a higher probability of having heart disease because his chest pain is of type 0 (typical angina).
3) sex_0 == 0 - not being female (being male) - effect ~ -0.185 - it had a negative effect on the prediction, meaning that the patient has a higher probability of having heart disease because he is male.
4) thall_2 == 1 - thalassemia of 2 (normal) - effect ~ 0.18 - it had a positive effect on the prediction, meaning that the patient has a higher probability of not having heart disease because his thalassemia value is 2 (normal).
5) slp_1 == 0 - the slope of the peak exercise ST segment not being of type 1 - effect ~ 0.12 - it had a positive effect on the prediction, meaning that the patient has a higher probability of not having heart disease because his slope is of type 2, not 1.
etc.
The local prediction for patient 56 is a ~0.855 probability of NOT having heart disease. This is reasonably close to the model's prediction of ~0.99, and it points to the same conclusion: the patient does not have heart disease.
To calculate the local prediction from the LIME explanation, we multiply the effect value of each feature in the explanation by the value of the corresponding interpretable (binarized) feature for the given patient, sum the products, and add the intercept of the local surrogate model.
As all the interpretable features in this explanation are binary and equal to 1, this reduces to summing the effect values and adding the intercept. The result is the local prediction for the given patient.
explanation.result['effect'].sum() + explanation.intercept[1] = 0.855
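As a concrete check, this arithmetic can be reproduced on a mock explanation table. The effect values below mirror the ones listed for patient 56, and the intercept of 0.58 is an assumed value chosen for illustration; the real numbers would come from `explanation.result` and `explanation.intercept`:

```python
import pandas as pd

# Mock excerpt of a LIME explanation table for patient 56; the effect
# values echo the ones discussed above, and the intercept is assumed.
explanation_result = pd.DataFrame({
    "rule":   ["caa_0 == 1", "cp_0 == 1", "sex_0 == 0", "thall_2 == 1", "slp_1 == 0"],
    "effect": [0.39, -0.23, -0.185, 0.18, 0.12],
})
intercept = 0.58  # assumed surrogate intercept, for illustration only

# All interpretable features are binary rules equal to 1 for this patient,
# so the local prediction is the sum of effects plus the intercept.
local_pred = explanation_result["effect"].sum() + intercept
print(round(local_pred, 3))
```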
The second patient, number 167, is a female, aged 62 years, with chest pain type 0 (typical angina), resting blood pressure of 140 mm Hg, cholesterol of 268 mg/dl, fasting blood sugar of 0 (false), resting electrocardiographic results of 0 (normal), maximum heart rate achieved of 160, exercise induced angina of 0 (no), ST depression induced by exercise relative to rest of 3, slope of the peak exercise ST segment of 0, number of major vessels of 2, and thalassemia of 2 (normal).
The model predicts a ~0.019 probability that the patient does not have heart disease, i.e. a ~0.981 probability that she does. The prediction is correct: the patient does have heart disease.
According to the LIME explanation, the most important features that led to a prediction of heart disease for patient 167 are:
1) caa_0 == 0 - the number of major vessels not being 0 (the patient has 2) - effect ~ -0.39 - it had a negative attribution to the prediction, meaning that the patient has a higher probability of having heart disease because her number of major vessels is not 0.
2) cp_0 == 1 - chest pain type 0 (typical angina) - effect ~ -0.24 - it had a negative attribution to the prediction, meaning that the patient has a higher probability of having heart disease because her chest pain is of type 0 (typical angina).
3) oldpeak > 1.60 - ST depression induced by exercise relative to rest above 1.60 (the patient's value is 3) - effect ~ -0.19 - it had a negative attribution to the prediction, meaning that the patient has a higher probability of having heart disease because her oldpeak is higher than 1.60.
4) sex_0 == 1 - being female - effect ~ 0.18 - it had a positive attribution to the prediction, meaning that the patient has a lower probability of heart disease because she is female.
etc.
Let us now compare the LIME explanations for 100 randomly chosen patients. We will use the XGBoost model to predict heart disease for these patients, calculate basic statistics for the effect values, and plot the distribution of effect values for selected features. This way we will be able to see if there are any systematic differences between the explanations.
If we look closely at the row containing the standard deviation of each feature, we can see that the values are very close to zero. This means that the attribution of each feature varies little across patients.
Let us look closely at the feature `0.00 < caa_0 <= 1.00` (caa_0 == 1). 62 patients out of the 100 sampled had 0 major vessels. If we look at the boxplot of the effect values of this feature, we can see that:
1) the minimum contribution value was 0.386 while the maximum was 0.398; such a small difference suggests that this feature contributes similarly to the model prediction for many patients.
2) the small standard deviation supports the same conclusion.
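The aggregation step can be sketched as follows, with synthetic effect values standing in for the ones collected from the real explanations (in the notebook, each row would come from one patient's LIME result):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)

# Synthetic stand-in for effect values collected from LIME explanations
# of 100 sampled patients; the centres mimic the magnitudes discussed
# above, and the spread is purely illustrative.
features = ["caa_0 == 1", "cp_0 == 1", "sex_0 == 0"]
centres = [0.39, -0.23, -0.185]
rows = [{f: c + rng.normal(scale=0.004) for f, c in zip(features, centres)}
        for _ in range(100)]
effects = pd.DataFrame(rows)

# Per-feature summary: a standard deviation close to zero means the
# attribution of that feature barely varies across patients.
summary = effects.describe().loc[["mean", "std", "min", "max"]]
print(summary.round(3))
```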
We will now run the LIME explanation for the same patient 100 times and see how stable the explanations are. We will calculate the mean and standard deviation of the effect values for each feature, and plot the distribution of effect values for each feature. This way we will be able to see if there are any systematic differences between the runs.
The table shows that the standard deviation of the effect values is very small (close to 0) for every feature, meaning that there is little variability in the effect values when running the algorithm with different seeds.
We can conclude that the method is stable and not sensitive to the random initialization.
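This seed-stability check can be reproduced on a toy surrogate: re-fit the same weighted linear model with different random seeds and measure the spread of the fitted effects. The black box and sampling scheme below are illustrative stand-ins for the real model and LIME's perturbation procedure:

```python
import numpy as np

# Synthetic stand-in for the fitted model (not the real XGBoost one).
def black_box(X):
    return 1.0 / (1.0 + np.exp(-(2.0 * X[:, 0] - 1.5 * X[:, 1])))

x0 = np.array([0.5, -0.2])  # the observation to explain

def lime_effects(seed, n=2000):
    """Fit a weighted linear surrogate around x0 with a given seed."""
    rng = np.random.default_rng(seed)
    Z = x0 + rng.normal(scale=0.3, size=(n, 2))
    w = np.exp(-np.sum((Z - x0) ** 2, axis=1) / 0.25)
    A = np.hstack([np.ones((n, 1)), Z])
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(sw[:, None] * A, sw * black_box(Z), rcond=None)
    return coef[1:]  # effect values (intercept dropped)

# Repeat the explanation for 50 seeds and summarize the spread.
runs = np.array([lime_effects(seed) for seed in range(50)])
print("mean:", runs.mean(axis=0).round(3), "std:", runs.std(axis=0).round(4))
```

With enough perturbed samples per run, the per-seed standard deviation of each effect is tiny, matching the stability observed in the notebook.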
Let us now compare the explanations derived from SHAP and LIME for the same patient, and decide whether these methods produce substantially different results.
It is worth noting, however, that the two methods explain the model in different ways. SHAP attributes the prediction to features via their average marginal contributions over coalitions of features, while LIME fits a simple interpretable surrogate model in the local neighbourhood of the observation.
SHAP assigns an attribution to every feature in the dataset, while LIME reports only the subset of features selected for the explanation.
SHAP also takes the interactions between features into account, while LIME's linear surrogate does not.
These differences between the methods make it difficult to compare their results directly.
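To make the contrast concrete, here is a didactic sketch of what SHAP computes: exact Shapley values for a tiny three-feature model, with "absent" features set to a fixed background point. This illustrates the attribution principle behind SHAP, not the shap/dalex implementation, and the model is synthetic:

```python
import numpy as np
from itertools import combinations
from math import factorial

# Tiny synthetic model with an interaction between features 0 and 2.
def model(x):
    return 2.0 * x[0] - 1.5 * x[1] + 0.5 * x[0] * x[2]

background = np.zeros(3)        # reference point for "absent" features
x = np.array([1.0, 1.0, 1.0])   # observation to explain
n = 3

def value(S):
    """Model output with only the features in S taken from x."""
    z = background.copy()
    z[list(S)] = x[list(S)]
    return model(z)

# Exact Shapley values: weighted average marginal contribution of each
# feature over all coalitions of the remaining features.
phi = np.zeros(n)
for i in range(n):
    others = [j for j in range(n) if j != i]
    for k in range(n):
        for S in combinations(others, k):
            wgt = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
            phi[i] += wgt * (value(S + (i,)) - value(S))

# Efficiency: attributions sum exactly to f(x) - f(background).
print(phi, phi.sum(), model(x) - model(background))
```

Note how the interaction term is split between features 0 and 2, which is exactly the behaviour LIME's linear surrogate cannot reproduce.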
LIME Explanation:
The most important features that led to a prediction of no heart disease for patient 56 are:
1) caa_0 == 1 - the number of major vessels being 0 - effect ~ 0.39 - it had a positive effect on the prediction, meaning that the patient has a higher probability of not having heart disease because he has 0 major vessels.
2) cp_0 == 1 - chest pain type 0 (typical angina) - effect ~ -0.24 - it had a negative effect on the prediction, meaning that the patient has a higher probability of having heart disease because his chest pain is of type 0 (typical angina).
These were the two most important features in the LIME explanation.
SHAP Explanation:
1) caa_0 == 1 - the number of major vessels being 0 - increased the average response by ~ 0.241 - it had a positive effect on the prediction, meaning that the patient has a higher probability of not having heart disease because he has 0 major vessels.
2) thall_2 == 1 - thalassemia of 2 (normal) - increased the average response by ~ 0.101 - it had a positive effect on the prediction, meaning that the patient has a higher probability of not having heart disease because his thalassemia value is 2 (normal).
For both methods, the most important feature was caa_0 == 1. The remaining features differed, however: for SHAP the second most important feature was thall_2 == 1, while for LIME it was cp_0 == 1. In other words, according to SHAP a thalassemia value of 2 was more meaningful for deciding whether the patient has heart disease than the chest pain type, while for LIME the opposite was true.
To see if SHAP is stable, we will run the SHAP explanation for the same patient 100 times. We will calculate the mean and standard deviation of the effect values for each feature, and plot the distribution of effect values for each feature. This way we will be able to see if there are any systematic differences between the runs.
The standard deviation for some features is very high (e.g. age == 48 and caa_0 == 1), meaning that the effect values for these features vary a lot between runs. This is not a good sign, as it suggests the method is not stable. However, recall that SHAP takes the interactions between features into account, so the attribution of a feature depends on the other features and can vary considerably between runs.
The age == 48 feature, according to the results below, can be attributed both positively and negatively to the prediction.
The chol == 222 feature, according to the results below, can likewise be attributed both positively and negatively to the prediction.
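The run-to-run spread has a simple source: sampling-based Shapley estimators average marginal contributions over random feature orderings, and features involved in interactions receive different contributions in different orderings. Below is a minimal Monte Carlo sketch with a synthetic model, not the real SHAP implementation:

```python
import numpy as np

# Synthetic model with an interaction between features 0 and 2.
def model(x):
    return 2.0 * x[0] - 1.5 * x[1] + 0.5 * x[0] * x[2]

background = np.zeros(3)
x = np.array([1.0, 1.0, 1.0])

def sampled_phi(seed, n_perm=20):
    """Estimate Shapley values from a few random permutations."""
    rng = np.random.default_rng(seed)
    phi = np.zeros(3)
    for _ in range(n_perm):
        order = rng.permutation(3)
        z = background.copy()
        prev = model(z)
        for i in order:           # add features one by one
            z[i] = x[i]
            cur = model(z)
            phi[i] += cur - prev  # marginal contribution in this ordering
            prev = cur
    return phi / n_perm

# Re-run the estimator 100 times with different seeds.
runs = np.array([sampled_phi(seed) for seed in range(100)])
print("mean:", runs.mean(axis=0).round(3))
print("std: ", runs.std(axis=0).round(3))
```

Feature 1 has no interactions, so its estimate is identical in every run, while features 0 and 2 fluctuate from run to run: the instability concentrates exactly in the interacting features.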
We will now compare the LIME explanations for 100 randomly chosen patients, this time using both the logistic regression and XGBoost models to predict heart disease for these patients. We will calculate basic statistics for the effect values and plot the distribution of effect values for selected features. This way we will be able to see if there are any systematic differences between the explanations of the two models.
The LIME explanations for both models are very similar: the most important features are the same, and the effect values are close. This suggests that the two models have learned similar relationships, and that there are no systematic differences between their explanations.
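The cross-model comparison can be sketched the same way: fit the same local surrogate around one observation for two different black boxes and compare the fitted effects feature by feature. Both models below are synthetic stand-ins (one with a mild interaction, one purely additive), not the notebook's fitted XGBoost and logistic regression models:

```python
import numpy as np

x0 = np.array([0.5, -0.2])  # the observation to explain

def model_a(X):  # stand-in for a tree ensemble (has an interaction term)
    return 1.0 / (1.0 + np.exp(-(2.0 * X[:, 0] - 1.5 * X[:, 1]
                                 + 0.3 * X[:, 0] * X[:, 1])))

def model_b(X):  # stand-in for logistic regression (purely additive)
    return 1.0 / (1.0 + np.exp(-(2.0 * X[:, 0] - 1.5 * X[:, 1])))

def surrogate_effects(f, seed=0, n=4000):
    """LIME-style weighted linear surrogate fitted around x0."""
    rng = np.random.default_rng(seed)
    Z = x0 + rng.normal(scale=0.3, size=(n, 2))
    w = np.exp(-np.sum((Z - x0) ** 2, axis=1) / 0.25)
    A = np.hstack([np.ones((n, 1)), Z])
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(sw[:, None] * A, sw * f(Z), rcond=None)
    return coef[1:]

ea, eb = surrogate_effects(model_a), surrogate_effects(model_b)
print("model_a effects:", ea.round(3))
print("model_b effects:", eb.round(3))
```

If the two models have learned similar local behaviour, the two effect vectors agree in sign and magnitude, which is the pattern observed for the two notebook models.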
In this notebook, we investigated the LIME method in detail. We saw how the method works and how it can be used to explain a model's predictions, and we explained how the local prediction of the LIME method is calculated. Furthermore, we checked the stability of the LIME method in terms of:
1) the stability of the explanations for the same patient
2) the stability of the explanations for the same model
3) the stability of the explanations for the same feature across many patients
We concluded that the LIME method is stable in terms of 1), 2) and 3).
We also compared the LIME method with SHAP, and we saw that the results are similar only for the most important feature; the effect values for the remaining features varied a lot. This is not a good sign, as it suggests that SHAP is not stable here. However, we gave an explanation for that: SHAP takes the interactions between features into account, which means that the effect values for some features will vary a lot, as they depend on other features.